MIRACLE's Approach to Multilingual Web Retrieval

Authors

Ángel Martínez-González

José Luis Martínez-Fernández

César de Pablo-Sánchez

Julio Villena-Román

Luis Jiménez-Cuadrado

Paloma Martínez

José Carlos González

Abstract

For the MIRACLE team's participation in WebCLEF 2005, a set of independent indexes was built for each top-level domain of the EuroGOV collection. Each of these indexes contains information extracted from the documents, such as the URL, title, keywords, detected named entities, or HTML headers. The indexes are queried to obtain partial document rankings, which are then combined using various relative weights to test the value of each index. The trie-based indexing and retrieval engine developed by the MIRACLE team is now fully functional; it has been adapted to the WebCLEF environment and was used in this campaign. Other tools, such as a named entity recognizer based on a finite automaton, have also been developed.
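The abstract does not state how the partial rankings are merged, so the following is only a minimal sketch of one common way to do it: a weighted sum of max-normalized scores. The function name `combine_rankings`, the index names, and the normalization step are all assumptions made for illustration, not the MIRACLE team's actual method.

```python
from collections import defaultdict


def combine_rankings(partial_rankings, weights):
    """Merge per-index rankings into one ranking (hypothetical helper).

    partial_rankings: dict mapping an index name (e.g. "title") to a list
        of (doc_id, score) pairs returned by querying that index.
    weights: dict mapping an index name to its relative weight.

    Assumption: scores are max-normalized per index before weighting,
    so that indexes with different score scales stay comparable.
    """
    combined = defaultdict(float)
    for index_name, ranking in partial_rankings.items():
        weight = weights.get(index_name, 0.0)
        if weight == 0.0 or not ranking:
            continue
        top = max(score for _, score in ranking)
        if top <= 0:
            continue
        for doc_id, score in ranking:
            combined[doc_id] += weight * (score / top)
    # Highest combined score first; ties broken by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Illustrative data only; these documents and scores are made up.
partial = {
    "title": [("doc1", 2.0), ("doc2", 1.0)],
    "url": [("doc2", 4.0)],
    "headers": [("doc3", 1.0), ("doc1", 0.5)],
}
weights = {"title": 0.5, "url": 0.3, "headers": 0.2}
merged = combine_rankings(partial, weights)
```

Changing the entries of `weights` and re-running the merge is one way to probe how much each index contributes, which is the kind of comparison the abstract describes.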

Similar resources

MIRACLE's 2005 Approach to Cross-lingual Information Retrieval

This paper presents the 2005 Miracle’s team approach to Bilingual and Multilingual Information Retrieval. In the multilingual track, we have concentrated our work on the merging process of the results of monolingual runs to get the multilingual overall result, relying on available translations. In the bilingual and multilingual tracks, we have used available translation resources, and in some c...

Supporting Multilingual Information Retrieval in Web Applications: An English-Chinese Web Portal Experiment

Cross-language information retrieval (CLIR) and multilingual information retrieval (MLIR) techniques have been widely studied, but they are not often applied to and evaluated for Web applications. In this paper, we present our research in developing and evaluating a multilingual English-Chinese Web portal in the business domain. A dictionary-based approach has been adopted that combines phrasal...

A multilingual text mining approach to web cross-lingual text retrieval

To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our approach will first discover the multilingual concept–term relationships from linguistically diverse textual data relevant to a domain. Second, the multilingual concept–term relationships, in turn, are used to discover the conceptual content of the multilingual text, which is either a document contai...

MIRACLE's 2005 Approach to Geographical Information Retrieval

This paper presents the 2005 MIRACLE’s team approach to Cross-Language Geographical Retrieval (GeoCLEF). The main goal of the GeoCLEF participation of the MIRACLE team was to test the effect that geographical information retrieval techniques cause to information retrieval. The baseline approach is based on the development of named entity recognition and geospatial information retrieval tools an...

Exploiting the Web as the multilingual corpus for unknown query translation

Users’ cross-lingual queries to a digital library system might be short and the query terms may not be included in a common translation dictionary (unknown terms). In this paper, we investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries. We propose a Web-based term transla...

Publication date: 2005
